    Term Matrix: a novel Gene Ontology annotation quality control system based on ontology term co-annotation patterns.

    Biological processes are accomplished by the coordinated action of gene products. Gene products often participate in multiple processes, and can therefore be annotated to multiple Gene Ontology (GO) terms. Nevertheless, processes that are functionally, temporally and/or spatially distant may have few gene products in common, and co-annotation to unrelated processes probably reflects errors in literature curation, ontology structure or automated annotation pipelines. We have developed an annotation quality control workflow that uses rules based on mutually exclusive processes to detect annotation errors, based on and validated by case studies including the three we present here: fission yeast protein-coding gene annotations over time; annotations for cohesin complex subunits in human and model species; and annotations using a selected set of GO biological process terms in human and five model species. For each case study, we reviewed available GO annotations, identified pairs of biological processes which are unlikely to be correctly co-annotated to the same gene products (e.g. amino acid metabolism and cytokinesis), and traced erroneous annotations to their sources. To date we have generated 107 quality control rules, and corrected 289 manual annotations in eukaryotes and over 52 700 automatically propagated annotations across all taxa
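    The rule-based check described in this abstract can be illustrated with a minimal sketch. It assumes each rule is an unordered pair of process terms that should not be co-annotated to the same gene product. The rule set, the function name `find_violations`, and the gene IDs below are hypothetical and are not taken from the Term Matrix implementation; only the example pair (amino acid metabolism and cytokinesis) comes from the abstract. The sketch also ignores the ontology hierarchy, which a real check would presumably need to take into account.

```python
from itertools import combinations

# Hypothetical rule set: pairs of GO biological process terms assumed to be
# mutually exclusive. The example pair comes from the abstract; the plain
# labels stand in for real GO identifiers.
EXCLUSION_RULES = {
    frozenset({"amino acid metabolism", "cytokinesis"}),
}


def find_violations(annotations, rules=EXCLUSION_RULES):
    """Return (gene, term_a, term_b) for each co-annotation that breaks a rule.

    annotations: mapping of gene product ID -> set of annotated process terms.
    """
    violations = []
    for gene, terms in sorted(annotations.items()):
        # Check every unordered pair of terms annotated to this gene product.
        for a, b in combinations(sorted(terms), 2):
            if frozenset({a, b}) in rules:
                violations.append((gene, a, b))
    return violations


# Made-up gene products, for illustration only.
example = {
    "geneA": {"amino acid metabolism", "cytokinesis"},
    "geneB": {"cytokinesis", "mitotic spindle assembly"},
}
flagged = find_violations(example)
```

    Here only `geneA` is flagged, because it is the only gene product annotated to both terms of a rule. A flagged pair is a candidate for review rather than a confirmed error.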

    The IntAct molecular interaction database in 2012

    IntAct is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. Two levels of curation are now available within the database, with both IMEx-level annotation and less detailed MIMIx-compatible entries currently supported. As from September 2011, IntAct contains approximately 275 000 curated binary interaction evidences from over 5000 publications. The IntAct website has been improved to enhance the search process and in particular the graphical display of the results. New data download formats are also available, which will facilitate the inclusion of IntAct's data in the Semantic Web. IntAct is an active contributor to the IMEx consortium (http://www.imexconsortium.org). IntAct source code and data are freely available at http://www.ebi.ac.uk/intact

    IntAct—open source resource for molecular interaction data

    IntAct is an open source database and software suite for modeling, storing and analyzing molecular interaction data. The data available in the database originates entirely from published literature and is manually annotated by expert biologists to a high level of detail, including experimental methods, conditions and interacting domains. The database features over 126 000 binary interactions extracted from over 2100 scientific publications and makes extensive use of controlled vocabularies. The web site provides tools allowing users to search, visualize and download data from the repository. IntAct supports and encourages local installations as well as direct data submission and curation collaborations. IntAct source code and data are freely available from

    Association of bovine leptin polymorphisms with energy output and energy storage traits in progeny tested Holstein-Friesian dairy cattle sires

    Peer-reviewed.
    Background: Leptin modulates appetite, energy expenditure and the reproductive axis by signalling via its receptor the status of body energy stores to the brain. The present study aimed to quantify the associations between 10 novel and known single nucleotide polymorphisms in genes coding for leptin and leptin receptor with performance traits in 848 Holstein-Friesian sires, estimated from performance of up to 43,117 daughter-parity records per sire.
    Results: All single nucleotide polymorphisms were segregating in this sample population and none deviated (P > 0.05) from Hardy-Weinberg equilibrium. Complete linkage disequilibrium existed between the novel polymorphism LEP-1609, and the previously identified polymorphisms LEP-1457 and LEP-580. LEP-2470 associated (P < 0.05) with milk protein concentration and calf perinatal mortality. It had a tendency to associate with milk yield (P < 0.1). The G allele of LEP-1238 was associated (P < 0.05) with reduced milk fat concentration, reduced milk protein concentration, longer gestation length and tended to associate (P < 0.1) with an increase in calving difficulty, calf perinatal mortality and somatic cells in the milk. LEP-963 exhibited an association (P < 0.05) with milk fat concentration, milk protein concentration, calving difficulty and gestation length. It also tended to associate with milk yield (P < 0.1). The R25C SNP associated (P < 0.05) with milk fat concentration, milk protein concentration, calving difficulty and length of gestation. The T allele of the Y7F SNP significantly associated with reduced angularity (P < 0.01) and reduced milk protein yield (P < 0.05). There was also a tendency (P < 0.1) for Y7F to associate with increased body condition score, reduced milk yield and shorter gestation. A80V associated with reduced survival in the herd (P < 0.05).
    Conclusions: Several leptin polymorphisms (LEP-2470, LEP-1238, LEP-963, Y7F and R25C) associated with the energetically expensive process of lactogenesis. Only SNP Y7F associated with energy storage. Associations were also observed between leptin polymorphisms and calving difficulty, gestation length and calf perinatal mortality. The lack of an association between the leptin variants investigated with calving interval in this large data set would question the potential importance of these leptin variants, or indeed leptin, in selection for improved fertility in the Holstein-Friesian dairy cow.
    Department of Agriculture, Food and Fisheries, Ireland - Research Stimulus Fund (RSF-06-0353; RSF-06-0409); Irish Dairy Research Trust; Teagasc Walsh Fellowshi

    Collaborative annotation of genes and proteins between UniProtKB/Swiss-Prot and dictyBase

    UniProtKB/Swiss-Prot, a curated protein database, and dictyBase, the Model Organism Database for Dictyostelium discoideum, have established a collaboration to improve data sharing. One of the major steps in this effort was the ‘Dicty annotation marathon’, a week-long exercise with 30 annotators aimed at achieving a major increase in the number of D. discoideum proteins represented in UniProtKB/Swiss-Prot. The marathon led to the annotation of over 1000 D. discoideum proteins in UniProtKB/Swiss-Prot. Concomitantly, there were a large number of updates in dictyBase concerning gene symbols, protein names and gene models. This exercise demonstrates how UniProtKB/Swiss-Prot can work in very close cooperation with model organism databases and how the annotation of proteins can be accelerated through those collaborations

    Correction of the consequences of mitochondrial 3243A>G mutation in the MT-TL1 gene causing the MELAS syndrome by tRNA import into mitochondria

    Mutations in human mitochondrial DNA are often associated with incurable human neuromuscular diseases. Among these mutations, an important number have been identified in tRNA genes, including 29 in the gene MT-TL1 coding for the tRNALeu(UUR). The m.3243A>G mutation was described as the major cause of the MELAS syndrome (mitochondrial encephalomyopathy with lactic acidosis and stroke-like episodes). This mutation was reported to reduce tRNALeu(UUR) aminoacylation and modification of its anti-codon wobble position, which results in a defective mitochondrial protein synthesis and reduced activities of respiratory chain complexes. In the present study, we have tested whether the mitochondrial targeting of recombinant tRNAs bearing the identity elements for human mitochondrial leucyl-tRNA synthetase can rescue the phenotype caused by MELAS mutation in human transmitochondrial cybrid cells. We demonstrate that nuclear expression and mitochondrial targeting of specifically designed transgenic tRNAs results in an improvement of mitochondrial translation, increased levels of mitochondrial DNA-encoded respiratory complexes subunits, and significant rescue of respiration. These findings prove the possibility to direct tRNAs with changed aminoacylation specificities into mitochondria, thus extending the potential therapeutic strategy of allotopic expression to address mitochondrial disorders

    The UniProt-GO Annotation database in 2011

    The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.uk/GOA) is a comprehensive set of evidenced-based associations between terms from the Gene Ontology resource and UniProtKB proteins. Currently supplying over 100 million annotations to 11 million proteins in more than 360 000 taxa, this resource has increased 2-fold over the last 2 years and has benefited from a wealth of checks to improve annotation correctness and consistency as well as now supplying a greater information content enabled by GO Consortium annotation format developments. Detailed, manual GO annotations obtained from the curation of peer-reviewed papers are directly contributed by all UniProt curators and supplemented with manual and electronic annotations from 36 model organism and domain-focused scientific resources. The inclusion of high-quality, automatic annotation predictions ensures the UniProt GO annotation dataset supplies functional information to a wide range of proteins, including those from poorly characterized, non-model organism species. UniProt GO annotations are freely available in a range of formats accessible by both file downloads and web-based views. In addition, the introduction of a new, normalized file format in 2010 has made for easier handling of the complete UniProt-GOA data set

    Diverse and Active Roles for Adipocytes During Mammary Gland Growth and Function

    The mammary gland is unique in its requirement to develop in close association with a depot of adipose tissue that is commonly referred to as the mammary fat pad. As discussed throughout this issue, the mammary fat pad represents a complex stromal microenvironment that includes a variety of cell types. In this article we focus on adipocytes as local regulators of epithelial cell growth and their function during lactation. Several important considerations arise from such a discussion. There is a clear and close interrelationship between different stromal tissue types within the mammary fat pad and its adipocytes. Furthermore, these relationships are both stage- and species-dependent, although many questions remain unanswered regarding their roles in these different states. Several lines of evidence also suggest that adipocytes within the mammary fat pad may function differently from those in other fat depots. Finally, past and future technologies present a variety of opportunities to model these complexities in order to more precisely delineate the many potential functions of adipocytes within the mammary glands. A thorough understanding of the role for this cell type in the mammary glands could present numerous opportunities to modify both breast cancer risk and lactation performance

    Gene Ontology annotations and resources.

    The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new 'phylogenetic annotation' process, manually reviewed, homology-based annotations are becoming available for a broad range of species. Third, the quality of GO annotations has been improved through a streamlined process for, and automated quality checks of, GO annotations deposited by different annotation groups. Fourth, the consistency and correctness of the ontology itself has increased by using automated reasoning tools. Finally, the GO has been expanded not only to cover new areas of biology through focused interaction with experts, but also to capture greater specificity in all areas of the ontology using tools for adding new combinatorial terms. The GOC works closely with other ontology developers to support integrated use of terminologies. The GOC supports its user community through the use of e-mail lists, social media and web-based resources

    The Gene Ontology resource: enriching a GOld mine

    The Gene Ontology Consortium (GOC) provides the most comprehensive resource currently available for computable knowledge regarding the functions of genes and gene products. Here, we report the advances of the consortium over the past two years. The new GO-CAM annotation framework was notably improved, and we formalized the model with a computational schema to check and validate the rapidly increasing repository of 2838 GO-CAMs. In addition, we describe the impacts of several collaborations to refine GO and report a 10% increase in the number of GO annotations, a 25% increase in annotated gene products, and over 9,400 new scientific articles annotated. As the project matures, we continue our efforts to review older annotations in light of newer findings, and, to maintain consistency with other ontologies. As a result, 20 000 annotations derived from experimental data were reviewed, corresponding to 2.5% of experimental GO annotations. The website (http://geneontology.org) was redesigned for quick access to documentation, downloads and tools. To maintain an accurate resource and support traceability and reproducibility, we have made available a historical archive covering the past 15 years of GO data with a consistent format and file structure for both the ontology and annotations